Bootstrap Model Aggregation for Distributed Statistical Learning
Abstract
In distributed or privacy-preserving learning, we are often given a set of probabilistic models estimated from different local repositories, and asked to combine them into a single model that gives statistically efficient estimation. A simple method is to linearly average the parameters of the local models; however, this tends to be degenerate for non-convex models and is not applicable when the models have different parameter dimensions. A more practical strategy is to generate bootstrap samples from the local models, and then learn a joint model based on the combined bootstrap set. Unfortunately, the bootstrap procedure introduces additional noise and can significantly deteriorate the performance. In this work, we propose two variance reduction methods to correct the bootstrap noise, including a weighted M-estimator that is both statistically efficient and practically powerful. Both theoretical and empirical analyses are provided to demonstrate our methods.
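To make the pipeline concrete, here is a minimal Python sketch, assuming each local model is a unit-variance Gaussian that shares only its fitted mean. The naive estimator refits on pooled bootstrap draws; the weighted variant reweights each draw by an importance-style density ratio to offset the bootstrap noise. The weighting shown is an illustrative choice, not necessarily the exact estimator derived in the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical setting: K local machines each share only the parameter of a
# Gaussian N(mu_k, 1) fitted to their private data.
local_mus = [0.9, 1.1, 1.05]
n_boot = 2000                      # bootstrap draws per local model

# Step 1: generate bootstrap samples from each local model and pool them.
boot = np.concatenate([rng.normal(mu, 1.0, n_boot) for mu in local_mus])

# (Possibly weighted) negative log-likelihood of the joint Gaussian model.
def neg_loglik(mu, x, w=None):
    return -np.average(norm.logpdf(x, loc=mu, scale=1.0), weights=w)

# Naive combination: refit on the pooled bootstrap set. The resampling adds
# Monte Carlo noise on top of the local estimation error.
mu_naive = minimize(neg_loglik, x0=np.array([1.0]), args=(boot,)).x[0]

# Weighted M-estimator: reweight each bootstrap point by the ratio of the
# local-model mixture density to a KDE of the bootstrap sample, an
# importance-style correction for the bootstrap noise (illustrative, not
# necessarily the paper's exact weighting).
mix_pdf = np.mean([norm.pdf(boot, mu, 1.0) for mu in local_mus], axis=0)
w = mix_pdf / gaussian_kde(boot)(boot)
mu_weighted = minimize(neg_loglik, x0=np.array([1.0]), args=(boot, w)).x[0]

print(f"naive: {mu_naive:.4f}  weighted: {mu_weighted:.4f}")
```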
Similar Resources
Estimation in Simple Step-Stress Model for the Marshall-Olkin Generalized Exponential Distribution under Type-I Censoring
This paper considers the simple step-stress model for the Marshall-Olkin generalized exponential distribution when there is a time constraint on the duration of the experiment. The maximum likelihood equations for estimating the parameters, assuming a cumulative exposure model with Marshall-Olkin generalized exponential lifetimes, are derived. The likelihood equations do not lead ...
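Since the likelihood equations admit no closed form, estimation is typically numerical. The sketch below fits a Type-I censored Marshall-Olkin generalized exponential sample by direct optimization; it uses a single stress level for brevity (the step-stress version strings such terms together under the cumulative exposure model), and the parameterisation is the standard Marshall-Olkin tilt of the GE cdf, assumed here rather than taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Marshall-Olkin generalized exponential: GE baseline cdf
# F(x) = (1 - exp(-lam*x))**a, tilted by Marshall-Olkin parameter b via
# S(x) = b*(1 - F(x)) / (1 - (1 - b)*(1 - F(x))).
def ge_cdf(x, a, lam):
    return (1.0 - np.exp(-lam * x)) ** a

def mo_sf(x, a, lam, b):
    s = 1.0 - ge_cdf(x, a, lam)
    return b * s / (1.0 - (1.0 - b) * s)

def mo_logpdf(x, a, lam, b):
    f = a * lam * np.exp(-lam * x) * (1.0 - np.exp(-lam * x)) ** (a - 1.0)
    s = 1.0 - ge_cdf(x, a, lam)
    return np.log(b) + np.log(f) - 2.0 * np.log(1.0 - (1.0 - b) * s)

def mo_sample(n, a, lam, b):
    S = rng.random(n)                      # target survival values
    s = S / (b + (1.0 - b) * S)            # invert the Marshall-Olkin tilt
    return -np.log(1.0 - (1.0 - s) ** (1.0 / a)) / lam  # invert the GE cdf

# Type-I censoring at tau: later observations are only known to exceed it.
a0, lam0, b0, tau = 2.0, 1.5, 0.8, 2.0
x = mo_sample(500, a0, lam0, b0)
obs, n_cens = x[x <= tau], int(np.sum(x > tau))

def neg_loglik(theta):
    a, lam, b = np.exp(theta)              # log-parameterisation keeps params > 0
    return -(mo_logpdf(obs, a, lam, b).sum()
             + n_cens * np.log(mo_sf(tau, a, lam, b)))

res = minimize(neg_loglik, x0=np.zeros(3), method="Nelder-Mead")
print(np.exp(res.x))                       # numerical MLEs of (a, lam, b)
```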
Dragging: Density-Ratio Bagging
We propose density-ratio bagging (dragging), a semi-supervised extension of the bootstrap aggregation (bagging) method. Additional unlabeled training data are used to compute a weight for each labeled training point via a density-ratio estimator. The weights define a weighted labeled empirical distribution, from which bags of bootstrap samples are drawn. Asymptotically, dragging...
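A minimal sketch of the dragging idea, assuming 1-D inputs, a KDE quotient as the density-ratio estimator, and a threshold stump as the base learner; all three are illustrative stand-ins, not the paper's choices.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)

# Hypothetical data: a small labeled set plus a larger unlabeled set drawn
# from a slightly shifted input distribution.
X_lab = rng.normal(0.0, 1.0, 200)
y_lab = (X_lab + rng.normal(0, 0.3, 200) > 0).astype(int)
X_unlab = rng.normal(0.3, 1.0, 2000)

# Step 1: plug-in density-ratio estimate q(x)/p(x) at the labeled points
# (a KDE quotient for simplicity; any density-ratio estimator could be used).
ratio = gaussian_kde(X_unlab)(X_lab) / gaussian_kde(X_lab)(X_lab)
p = ratio / ratio.sum()                    # sampling weights on labeled points

# Step 2: draw bags from the *weighted* empirical distribution and fit a
# committee of threshold stumps.
def fit_stump(x, y):
    grid = np.linspace(x.min(), x.max(), 50)
    return grid[int(np.argmax([np.mean((x > g) == y) for g in grid]))]

stumps = [fit_stump(X_lab[idx], y_lab[idx])
          for idx in (rng.choice(len(X_lab), len(X_lab), replace=True, p=p)
                      for _ in range(25))]

# Majority vote of the weighted-bootstrap committee.
def predict(x):
    return (np.mean([(x > t) for t in stumps], axis=0) > 0.5).astype(int)

print(predict(np.array([-1.0, 0.5, 1.5])))
```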
Distributed learning with bagging-like performance
Bagging forms a committee of classifiers by bootstrap aggregation of training sets from a pool of training data. A simple alternative to bagging is to partition the data into disjoint subsets. Experiments with decision tree and neural network classifiers on various datasets show that, given the same size partitions and bags, disjoint partitions result in performance equivalent to, or better than ...
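The contrast is easy to reproduce in a toy setting. The sketch below builds both committees over the same pool, bootstrap bags versus equal-size disjoint partitions, with a threshold-stump base learner standing in for the decision trees and neural networks used in the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy pool: 1-D inputs with labels from a noisy threshold rule.
n, k = 3000, 10                              # pool size, committee size
X = rng.normal(0, 1, n)
y = ((X + rng.normal(0, 0.5, n)) > 0).astype(int)

def fit_stump(x, t):
    grid = np.linspace(x.min(), x.max(), 100)
    return grid[int(np.argmax([np.mean((x > g) == t) for g in grid]))]

def committee_predict(stumps, x):
    return (np.mean([(x > s) for s in stumps], axis=0) > 0.5).astype(int)

# Bagging committee: k bootstrap bags of size n // k drawn from the pool.
bag_stumps = [fit_stump(X[idx], y[idx])
              for idx in (rng.choice(n, n // k, replace=True) for _ in range(k))]

# Disjoint-partition committee: the same pool split into k non-overlapping
# subsets of the same size, as happens naturally with distributed data.
part_stumps = [fit_stump(X[idx], y[idx])
               for idx in np.array_split(rng.permutation(n), k)]

X_test = rng.normal(0, 1, 1000)
y_test = (X_test > 0).astype(int)
for name, members in [("bagging", bag_stumps), ("disjoint", part_stumps)]:
    print(name, np.mean(committee_predict(members, X_test) == y_test))
```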
Bagged Structure Learning of Bayesian Network
We present a novel approach for density estimation using Bayesian networks when faced with scarce and partially observed data. Our approach relies on Efron's bootstrap framework, and replaces the standard model selection score by a bootstrap aggregation objective aimed at sifting out bad decisions during the learning procedure. Unlike previous bootstrap- or MCMC-based approaches that are only aimed ...
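A minimal sketch of the aggregation idea on a toy two-variable network: instead of accepting an edge when the BIC gain on the raw (scarce) data is positive, average the gain over bootstrap resamples and keep the edge only if it survives the resampling noise. The scoring and decision rule here are illustrative, not the paper's exact objective.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy data: two binary variables where A influences B, but the sample is small.
n = 150
A = rng.integers(0, 2, n)
B = (A ^ (rng.random(n) < 0.2)).astype(int)
data = np.stack([A, B], axis=1)

def bic_gain(d):
    """BIC gain of modelling B given A versus B alone (binary variables)."""
    a, b = d[:, 0], d[:, 1]
    def bern_ll(v):                    # log-likelihood of an i.i.d. Bernoulli fit
        p = np.clip(v.mean(), 1e-9, 1 - 1e-9)
        return v.sum() * np.log(p) + (len(v) - v.sum()) * np.log(1 - p)
    ll_indep = bern_ll(b)
    ll_cond = sum(bern_ll(b[a == v]) for v in (0, 1) if np.any(a == v))
    # the conditional model spends one extra parameter
    return (ll_cond - ll_indep) - 0.5 * np.log(len(d))

# Standard selection: decide the edge A -> B on the raw data alone.
raw_keep = bic_gain(data) > 0

# Bagged selection: average the score over bootstrap resamples, keeping the
# edge only if it survives the resampling noise.
gains = [bic_gain(data[rng.choice(n, n, replace=True)]) for _ in range(100)]
bagged_keep = np.mean(gains) > 0
print(raw_keep, bagged_keep)
```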
Publication date: 2016